Nonparametric estimators for the hazard functions in an additive risk
model for counting processes are studied. We establish a functional central
limit theorem for the integrated estimators and show how this can be used to
find the asymptotic null distribution of a maximal deviation statistic for
Kolmogorov-Smirnov type testing. In addition, we provide confidence bands
for approximations to the integrated hazard functions and show that certain
smoothed versions of the hazard function estimators are uniformly consistent.
1. Introduction.


The proportional hazards regression model of Cox (1972) for the analysis of
censored survival data has had considerable influence on the theory and practice
of biostatistics. This model can be described as follows. Let $\lambda(t) = \lambda\{t; Y(t)\}$
denote the "failure" rate at time t for a subject with covariate history
$Y(t) = (Y_1(t), \ldots, Y_p(t))$, i.e. $\lambda(t)\,dt$ is the probability that the subject dies
in the small time interval from t to t + dt. Cox's model assumes that $\lambda(t)$ has
the multiplicative form

$$\lambda(t) = \alpha_0(t) \exp\Big\{ \sum_{j=1}^{p} \beta_j Y_j(t) \Big\}, \qquad (1.1)$$

where $\beta_1, \ldots, \beta_p$ are constants to be estimated and $\alpha_0$ is an unknown baseline
hazard rate.

An alternative model, introduced by Aalen (1980), is the additive risk model
given by

$$\lambda(t) = \sum_{j=1}^{p} \alpha_j(t) Y_j(t), \qquad (1.2)$$

where $\alpha_1, \ldots, \alpha_p$ are unknown "hazard" functions. Although statistical methods
for Cox's model are well developed (e.g. Andersen and Gill, 1982), this has not
been so for Aalen's model, except in the case of one covariate (Aalen, 1978). The
object of the present paper is to provide techniques which can deal with any number
of covariates in the Aalen model.
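
To make the contrast between the multiplicative form (1.1) and the additive form (1.2) concrete, the following minimal Python sketch evaluates both intensities at a single time point. The function names and the numerical values are illustrative assumptions and do not come from the cited papers.

```python
import math


def cox_intensity(baseline, betas, covariates):
    """Multiplicative form (1.1): alpha_0(t) * exp(sum_j beta_j * Y_j(t))."""
    return baseline * math.exp(sum(b * y for b, y in zip(betas, covariates)))


def aalen_intensity(alphas, covariates):
    """Additive form (1.2): sum_j alpha_j(t) * Y_j(t)."""
    return sum(a * y for a, y in zip(alphas, covariates))


# Hypothetical values of the hazard functions and covariates at one time t.
y_t = [1.0, 2.0]
print(cox_intensity(2.0, [0.0, 0.0], y_t))   # zero coefficients leave the baseline
print(aalen_intensity([0.5, 0.25], y_t))     # 0.5 * 1 + 0.25 * 2
```

In the additive form each covariate contributes its own time-varying term, which is why the model can give more detailed information about the influence of each covariate.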

Aalen's  model  provides  a  useful  alternative  to  the  Cox  model  when  the  sample 
size  is  large  and  more  detailed  information  concerning  the  influence  of  each 
covariate  is  needed.  It  could  be  used  in  epidemiologic  studies  involving  large 
cohorts  of  individuals,  such  as  those  studies  cited  by  Breslow  (1986,  pp.  110,111), 
and  in  situations  where  the  proportional  hazards  assumption  of  the  Cox  model  is 
poorly satisfied. Methods for assessing the goodness of fit of the Cox model are
discussed  by  Arjas  (1986). 

Aalen (1975, 1978, 1980) formulated his model in the framework of counting
processes as follows. Let $X(t)$, $t \in [0,1]$, be a counting process which counts
observed events in the life of the subject observed over the time interval [0,1].
So the sample paths of X are step functions, zero at time zero, right-continuous
with unit jumps. We assume that X has random intensity process $\lambda(t)$ given by (1.2),
where $Y_1, \ldots, Y_p$ are predictable covariate processes. By "intensity" we mean
that the process $M(t) = X(t) - \int_0^t \lambda(s)\,ds$ is a square integrable martingale. Writing
X in differential form, $dX(t) = \lambda(t)\,dt + dM(t)$, we may regard $dM(t)$ as additive
"noise." The basic model equation is

$$X(t) = \int_0^t \lambda(s)\,ds + M(t), \qquad (1.3)$$

where $\lambda$ is given by (1.2).

The statistical problem is one of estimating the hazard functions $\alpha_1, \ldots, \alpha_p$
on the basis of n iid copies of $X, Y_1, \ldots, Y_p$ observed over [0,1].

Our initial approach is to treat the hazard functions as though
they are piecewise constant functions. This reduces the problem to the estimation
of finitely many parameters. It would be possible to use the method of maximum
likelihood at this stage, as has been done by Buckley (1984). However, we are
interested in obtaining asymptotic results which do not implicitly assume that the
hazard functions are of a piecewise constant form. Under these circumstances the
mathematics of maximum likelihood becomes intractable, except in the case of one
covariate (Karr, 1983). A way out of this difficulty is to use a quasi-least-squares
estimator for the parameters in the piecewise constant hazard functions.
Quasi-least-squares estimators for this model were introduced by McKeague (1986a), and in
a related model by Christopeit (1986) and Le Breton and Musiela (1986).
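
The idea of fitting piecewise constant hazards by least squares can be sketched as follows. This is a minimal illustration under stated assumptions, not the estimator as defined in the cited papers: it assumes time-constant covariates, equal-width bins on [0,1], and a normal-equations form in which, on each bin, the increments of X are regressed on the covariates. The helper names `solve_linear` and `histogram_sieve_ls` are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        if abs(m[col][col]) < 1e-12:
            raise ValueError("singular normal equations")
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def histogram_sieve_ls(increments, covariates, d):
    """Least-squares fit of piecewise constant hazards on d equal bins.

    increments[i][r] : increment of X_i over bin r
    covariates[i][j] : value of Y_ij (assumed constant in time)
    Returns estimates[r][j], the fitted value of alpha_j on bin r.
    """
    n, p = len(covariates), len(covariates[0])
    width = 1.0 / d
    # Assumed normal equations on each bin:
    #   (width * sum_i Y_i Y_i^T) a = sum_i Y_i * (increment of X_i)
    gram = [[width * sum(covariates[i][j] * covariates[i][k] for i in range(n))
             for k in range(p)] for j in range(p)]
    estimates = []
    for r in range(d):
        rhs = [sum(covariates[i][j] * increments[i][r] for i in range(n))
               for j in range(p)]
        estimates.append(solve_linear(gram, rhs))
    return estimates
```

With noise-free increments generated from known piecewise constant hazards, the fit returns those hazards exactly, which is a quick sanity check of the normal equations.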


The final step is to allow the mesh size of the piecewise constant functions
to tend to zero at a rate tied to the sample size n, obtaining asymptotic results
as $n \to \infty$. This amounts to a use of the method of sieves (Grenander, 1981)
involving the so-called histogram sieve. Let $\hat\alpha_j^{(n)}(t)$ denote the estimator of $\alpha_j$
obtained in this way. We shall obtain a functional central limit theorem for
$\hat A_j^{(n)}(t) = \int_0^t \hat\alpha_j^{(n)}(s)\,ds$. This result is used to find the asymptotic distribution of
a maximal deviation statistic for testing the hypothesis $H_0$: $\alpha_j = \alpha_0$, where $\alpha_0$ is
a known function. In addition, we shall provide confidence bands for the best
approximation to the integrated hazard function $A_j(t) = \int_0^t \alpha_j(s)\,ds$ within the
histogram sieve and provide a uniformly consistent estimator for $\alpha_j$.
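
Integrating a piecewise constant hazard estimate gives a continuous, piecewise linear integrated hazard. The short Python sketch below shows that computation, assuming equal-width bins on [0,1]; the function name `integrated_hazard` is hypothetical.

```python
def integrated_hazard(values, t):
    """Integral over [0, t] of the step function equal to values[r] on the
    r-th of len(values) equal-width bins of [0, 1]."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    d = len(values)
    full = min(int(t * d), d)           # number of bins lying entirely below t
    total = sum(values[:full]) / d      # contribution of the complete bins
    if full < d:
        total += values[full] * (t - full / d)  # partial bin containing t
    return total


print(integrated_hazard([1.0, 3.0], 0.75))  # 0.5 * 1 + 0.25 * 3
```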


Our results are not restricted to counting processes. Indeed, processes
satisfying equation (1.3) and a technical condition known as left-quasi-continuity
are included. We have previously treated the functional central limit theorem for
$\hat A_j^{(n)}$ in the case that X is a continuous process (McKeague, 1986b) using orthogonal
series sieves. An $L^2$-consistency result for $\hat\alpha_j^{(n)}$ was established in McKeague (1986a).


2.  Histogram  sieve  estimators. 

$(\Omega, F, P)$ will denote a complete probability space and $(F_t, t \in [0,1])$ a nondecreasing
right-continuous family of sub-$\sigma$-fields of F where $F_0$ contains all P-null sets in F.
All processes are indexed by $t \in [0,1]$. The process $M = (M(t), F_t)$ is assumed to
be a square integrable martingale such that almost all paths of M are right-continuous
on [0,1) with left limits on (0,1]. Write $\Delta M_t = M_t - M_{t-}$, the jump in M at time t.

A stochastic process $X = (X(t), t \in [0,1])$ is said to be left-quasi-continuous (see
Dellacherie, 1972, p. 85) if for all predictable stopping times $\tau$ taking values
in [0,1], $X_{\tau-} = X_\tau$ almost surely. Here a stopping time $\tau$ is said to be predictable
if there is an increasing sequence $(\tau_n)$ of stopping times such that $\tau_n < \tau$ a.s.
for all $n \ge 1$ and $\tau_n \uparrow \tau$ a.s. If X is a counting process with a continuous




Write $A_j^{\{n\}}(t) = \int_0^t \alpha_j^{\{n\}}(s)\,ds$. A function f is said to be Lipschitz of order $\gamma$,
where $0 < \gamma \le 1$, if there is a constant C such that for all s, t in the domain of f,
$|f(t) - f(s)| \le C|t - s|^{\gamma}$. If f is Lipschitz of order 1 we simply say it is Lipschitz.

We first state our functional central limit theorem for $\hat A_j^{(n)}(t) = \int_0^t \hat\alpha_j^{(n)}(s)\,ds$
in the counting process case.


Theorem 2.1. Suppose that X is a counting process, the histogram sieve is used,
conditions (C1) - (C4) (stated below) hold and $d_n \to \infty$, $d_n = o(n^{1/2})$. Then

$$\sqrt{n}\,(\hat A_j^{(n)} - A_j^{\{n\}}) \xrightarrow{D} m_j \quad \text{in } C[0,1],$$

where $m_j$ is a continuous Gaussian martingale with mean zero and covariance function

$$\mathrm{Cov}(m_j(s), m_j(t)) = \int_0^{s \wedge t} L_j^2(u) \Big\{ \sum_{k=1}^{p} \alpha_k(u)\, E[Y_j^2(u) Y_k(u)] \Big\}\,du,$$

where $L_j(u)$ is the jth element of diag$[K^{-1}(u)]$, in which $K^{-1}(u)$ is the inverse of
the $p \times p$ matrix $K(u)$ having components $K_{rk}(u) = E[Y_r(u) Y_k(u)]$.


Conditions. 


(C1) $\alpha_j$ is Lipschitz of order $\gamma > 1/2$ for j = 1, ..., p.

(C2) $\inf_{t \in [0,1]} E\,Y_j^2(t) > 0$ for j = 1, ..., p.


(C3) $\sup_{t \in [0,1]} |E\,Y_j(t) Y_k(t)| \;[\ldots]\; \big[\inf_{t \in [0,1]} E\,Y_j^2(t)\big]^{1/2} \big[\inf_{t \in [0,1]} E\,Y_k^2(t)\big]^{1/2}$
for all $1 \le j < k \le p$, applicable for $p \ge 2$.


(C4) $\sup_{t \in [0,1]} E|Y_j(t)|^5 < \infty$ for j = 1, ..., p.


Remarks. 


(i) In the special case of a single covariate (p = 1) we have that the
asymptotic variance of the estimator $\hat A_1^{(n)}(t)$ is

$$E\,m_1^2(t) = \int_0^t \frac{\alpha_1(s)\, E\,Y_1^3(s)}{[E\,Y_1^2(s)]^2}\,ds. \qquad (2.5)$$

However, a natural estimator for $A_1$ in the case p = 1 is the well known
Nelson-Aalen estimator given by

$$\tilde A_1^{(n)}(t) = \int_0^t \frac{dX^{(n)}(s)}{Y^{(n)}(s)}, \qquad (2.6)$$

where $X^{(n)}(t) = \sum_{i=1}^{n} X_i(t)$ and $Y^{(n)}(t) = \sum_{i=1}^{n} Y_{i1}(t)$. Here 1/0 is defined to be zero.
Aalen (1978) showed that $\sqrt{n}\,(\tilde A_1^{(n)} - A_1)$ converges weakly in the Skorohod space D[0,1]
to a continuous Gaussian martingale $m_0$ such that

$$E\,m_0^2(t) = \int_0^t \frac{\alpha_1(s)}{E\,Y_1(s)}\,ds. \qquad (2.7)$$

It is interesting to compare (2.5) and (2.7). Note that $E\,m_0^2(t) \le E\,m_1^2(t)$ by Hölder's
inequality. Aalen (1980) has extended $\tilde A_1^{(n)}$ to give estimators $\tilde A_1^{(n)}, \ldots, \tilde A_p^{(n)}$ of
$A_1, \ldots, A_p$ when p > 1; however, it has not been possible to obtain consistency or
asymptotic normality results for these estimators.
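
The comparison of (2.5) and (2.7) can be illustrated numerically by evaluating the two integrands at a fixed time point for given moments of the covariate. The sketch below is only an illustration: the function names are hypothetical, and the two covariate distributions (an exponential covariate with unit mean, and a 0/1 at-risk indicator) are assumed examples.

```python
def sieve_variance_rate(alpha, m2, m3):
    """Integrand of (2.5): alpha * E[Y^3] / (E[Y^2])^2."""
    return alpha * m3 / m2 ** 2


def nelson_aalen_variance_rate(alpha, m1):
    """Integrand of (2.7): alpha / E[Y]."""
    return alpha / m1


# Assumed example 1: Y exponential with mean 1, so E[Y] = 1, E[Y^2] = 2, E[Y^3] = 6.
print(sieve_variance_rate(1.0, 2.0, 6.0), nelson_aalen_variance_rate(1.0, 1.0))

# Assumed example 2: Y is a 0/1 indicator with P(Y = 1) = 0.4, so all moments equal 0.4
# and the two integrands coincide.
print(sieve_variance_rate(1.0, 0.4, 0.4), nelson_aalen_variance_rate(1.0, 0.4))
```

For an indicator covariate $Y^2 = Y^3 = Y$, so the two variance integrands agree; for other nonnegative covariates the integrand of (2.7) is the smaller one, in line with the inequality noted above.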

(ii) McKeague (1986b) showed that (C3) implies K(t) is non-singular for
almost every t, so the covariance function of $m_j$ in the statement of Theorem 2.1 is
well defined. It seems reasonable to conjecture that Theorem 2.1 remains true
when the quite restrictive condition (C3) is replaced by the condition that the
minimum eigenvalue of K(t) is bounded away from zero. We could have replaced (C3)
by a weak condition of this kind, but to avoid technical distractions we have
refrained from doing so. From the point of view of applications it would be safe
to disregard condition (C3).


Theorem 2.1 is a consequence of the following more general result. $\langle M \rangle$ denotes
the predictable quadratic variation process of M, i.e. the unique increasing
predictable process such that $M_t^2 - \langle M \rangle_t$ is a martingale.


Theorem 2.2. Suppose that X is a left-quasi-continuous process, the histogram
sieve is used, conditions (C1) - (C3), (D1) - (D5) (stated below) hold and $d_n \to \infty$,
$d_n = o(n^{1/2})$. Then

$$\sqrt{n}\,(\hat A_j^{(n)} - A_j^{\{n\}}) \xrightarrow{D} m_j \quad \text{in } C[0,1],$$

where $m_j$ is a continuous Gaussian martingale with mean zero and covariance function

$$\mathrm{Cov}(m_j(s), m_j(t)) = E \int_0^{s \wedge t} L_j^2(u)\, Y_j^2(u)\, d\langle M \rangle_u,$$

where $L_j$ is defined in the statement of Theorem 2.1.

Conditions 

(Dl)  sup  EY^(t)  <  «  for  j  =  1,  ...,  p. 
te[0,l]  3 

(D2) The predictable variation process $\langle M \rangle$ has absolutely continuous
sample paths (a.s.) and there exists $\gamma > 1$ such that

$$\sup_{t \in [0,1]} E\Big\{ |Y_j(t)|^{\gamma} \Big[ \frac{d\langle M \rangle_t}{dt} \Big]^{\gamma} \Big\} < \infty, \quad \text{for } j = 1, \ldots, p.$$

(D3) The process $\sum_{0 < s \le t} (\Delta M_s)^2$ and its compensator have finite second
moments at t = 1.

(D4) $E\big( \sum_{0 < s \le 1} Y_j^2(s) (\Delta M_s)^2 \big) < \infty$ for j = 1, ..., p.

(D5) The function

$$\theta_j(t) = E\Big[ \int_0^t Y_j^4(s)\, d\pi_s \Big], \quad t \in [0,1],$$

where $\pi = (\pi_t)$ is the compensator of the process $\sum_{0 < s \le t} (\Delta M_s)^4$,
is Lipschitz for j = 1, ..., p.




3.  The  maximal  deviation  statistic. 


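Define $p \times p$ matrices $\hat K^{(n)}$ and $\hat M^{(n)}$ by

[...]

and let $\hat L_j^{(n)}(t)$ be the jth element of the diagonal of a generalized inverse of $\hat K^{(n)}(t)$.
In order to test the hypothesis $H_0$: $\alpha_j = \alpha_0$, where $\alpha_0$ is a known function, we
propose the use of the statistic

$$T_j^{(n)} = \sqrt{n}\, \hat G_j^{(n)}(1)^{1/2} \sup_{t \in [0,1]} \left| \frac{\hat A_j^{(n)}(t) - A_0^{\{n\}}(t)}{\hat G_j^{(n)}(1) + \hat G_j^{(n)}(t)} \right|,$$

where $A_0^{\{n\}}(t) = \int_0^t \alpha_0^{\{n\}}(s)\,ds$, $\alpha_0^{\{n\}}$ is the best $L^2$-approximation to $\alpha_0$ from within
$S_n$, and

$$\hat G_j^{(n)}(t) = \int_0^t [\hat L_j^{(n)}(s)]^2 \Big\{ \alpha_0(s)\, \hat M_{jj}^{(n)}(s) + \sum_{k \ne j} \hat\alpha_k^{(n)}(s)\, \hat M_{jk}^{(n)}(s) \Big\}\,ds.$$

The following result gives the asymptotic null distribution of $T_j^{(n)}$.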
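
Once the integrated estimator, the null integrated hazard and the variance estimate have been evaluated on a grid, the statistic is a weighted supremum. The following Python sketch computes it under the assumptions that all three inputs are given on a common grid of time points whose last point is t = 1, and that the supremum is approximated by the maximum over that grid; the function name and the toy inputs are hypothetical.

```python
import math


def max_deviation_statistic(a_hat, a_null, g_hat, n):
    """sqrt(n) * G(1)^(1/2) * max_t |A_hat(t) - A_null(t)| / (G(1) + G(t)).

    a_hat, a_null, g_hat are lists on a common time grid ending at t = 1,
    so g_hat[-1] plays the role of the variance estimate at t = 1.
    """
    g1 = g_hat[-1]
    sup = max(abs(a - a0) / (g1 + g) for a, a0, g in zip(a_hat, a_null, g_hat))
    return math.sqrt(n) * math.sqrt(g1) * sup


# Toy inputs on the grid t = 0, 0.5, 1 (illustrative values only).
t_stat = max_deviation_statistic([0.0, 0.3, 0.5], [0.0, 0.2, 0.5],
                                 [0.0, 0.5, 1.0], n=100)
print(t_stat)
```

The hypothesis would be rejected when the computed value exceeds the critical value of the limiting distribution.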


Theorem 3.1. Suppose that the conditions of Theorem 2.1 are satisfied, the sample
paths of $Y_k$ are left continuous with right hand limits, $\inf_{t \in [0,1]} |\det K(t)| > 0$ and
for k = 1, ..., p,

$$E\Big[ \sup_{t \in [0,1]} |Y_k(t)|^3 \Big] < \infty. \qquad (3.1)$$

Then if $H_0$ holds

$$\lim_{n \to \infty} P\{ T_j^{(n)} > c_\alpha \} = \alpha,$$

where $0 < \alpha < 1$, $c_\alpha$ is the upper $\alpha$ quantile of the distribution of $\sup_{t \in [0,1/2]} |B^0(t)|$
and $B^0$ is the Brownian bridge process.


Using arguments in the proof of Theorem 3.1 it can be checked that $T_j^{(n)}$
provides a consistent test against all alternatives in the sense that $T_j^{(n)} \to \infty$ a.s.
under any alternative. A table for the distribution of $\sup_{t \in [0,1/2]} |B^0(t)|$ has been
given by Hall and Wellner (1980). For instance $c_{0.05} = 1.273$.


4. Confidence bands.

Under the condition

$$\sqrt{n} \sup_{t \in [0,1]} |A_j(t) - A_j^{\{n\}}(t)| \to 0 \qquad (4.1)$$

Theorem 2.1 holds with $A_j^{\{n\}}$ replaced by $A_j$. So under (4.1) we may obtain
confidence bands for $A_j$. Unfortunately the class of functions $\alpha_j$ for which (4.1)
holds with $d_n = o(n^{1/2})$ is too small to be of practical interest. We are obliged
to make do with confidence bands for $A_j^{\{n\}}$.


First estimate $G_j(t) = E\,m_j^2(t)$ by $\hat G_j^{(n)}(t)$, as in section 3 but with $\alpha_0$ replaced
by $\hat\alpha_j^{(n)}$. Then under the conditions of Theorem 3.1 we see that an asymptotic
$100(1-\alpha)\%$ confidence band for $A_j^{\{n\}}$ has upper and lower limits given by

$$\hat A_j^{(n)}(t) \pm c_\alpha\, n^{-1/2}\, \hat G_j^{(n)}(1)^{1/2} \left( 1 + \frac{\hat G_j^{(n)}(t)}{\hat G_j^{(n)}(1)} \right), \quad t \in [0,1],$$

where $c_\alpha$ is defined in the statement of Theorem 3.1.
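
The band limits follow from inverting the maximal deviation statistic: the half-width at time t is $c_\alpha n^{-1/2} G(1)^{1/2} (1 + G(t)/G(1))$ with G replaced by its estimate. A minimal Python sketch, assuming the estimates are supplied on a common grid ending at t = 1 and that the critical value is taken from a table; the function name and the toy inputs are hypothetical.

```python
import math


def confidence_band(a_hat, g_hat, n, c_alpha):
    """Return (lower, upper) limits on the grid.

    Half-width at t: c_alpha * sqrt(G(1) / n) * (1 + G(t) / G(1)),
    where g_hat[-1] is the variance estimate at t = 1.
    """
    g1 = g_hat[-1]
    scale = c_alpha * math.sqrt(g1 / n)
    return [(a - scale * (1.0 + g / g1), a + scale * (1.0 + g / g1))
            for a, g in zip(a_hat, g_hat)]


# Toy inputs on the grid t = 0, 1 (illustrative values only).
for lower, upper in confidence_band([0.0, 0.5], [0.0, 1.0], n=100, c_alpha=1.273):
    print(lower, upper)
```

The band is narrowest at t = 0 and its half-width doubles by t = 1, reflecting the weight function in the statistic.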


5.  Kernel  estimators  for  the  hazard  functions. 


The functions of real interest are the hazard functions $\alpha_1, \ldots, \alpha_p$ rather
than the integrated hazard functions $A_1, \ldots, A_p$. It would be difficult to
obtain an adequate picture of $\alpha_j$ by visually assessing the gradient of $\hat A_j^{(n)}$.
The estimator $\hat\alpha_j^{(n)}$ itself has only been shown to be consistent in the $L^2$-sense:
$\int_0^1 [\hat\alpha_j^{(n)}(t) - \alpha_j(t)]^2\,dt \xrightarrow{P} 0$, see McKeague (1986a), and experience shows that it
has very rough pathwise behaviour. However, in the following theorem we are
able to show that a smoothed version of $\hat\alpha_j^{(n)}$ provides a uniformly consistent
estimator of $\alpha_j$. This result was obtained under the assumption that X is a
continuous process by McKeague (1986b, Theorem 3.2). Let K be a (kernel) function
having integral 1 and support [-1,1]. Define

$$\tilde\alpha_j^{(n)}(t) = \frac{1}{b_n} \int_0^1 K\Big( \frac{t-s}{b_n} \Big)\, \hat\alpha_j^{(n)}(s)\,ds,$$

where $b_n > 0$ is a bandwidth parameter.
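
The kernel smoothing step can be sketched numerically as follows. This is an illustration under stated assumptions: the hazard estimate is a step function on equal-width bins of [0,1], the integral is approximated by a midpoint rule, and the Epanechnikov kernel is an assumed choice (it has integral 1, support [-1,1] and is Lipschitz). The function names are hypothetical.

```python
def epanechnikov(x):
    """Assumed kernel: 0.75 * (1 - x^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - x * x) if abs(x) <= 1.0 else 0.0


def smooth_step_estimate(values, t, bandwidth, kernel=epanechnikov, grid=2000):
    """Approximate (1/b) * integral_0^1 K((t - s)/b) * alpha_hat(s) ds,
    where alpha_hat equals values[r] on the r-th equal-width bin of [0, 1]."""
    d = len(values)
    h = 1.0 / grid
    total = 0.0
    for i in range(grid):
        s = (i + 0.5) * h                  # midpoint of the i-th subinterval
        r = min(int(s * d), d - 1)         # bin containing s
        total += kernel((t - s) / bandwidth) * values[r] * h
    return total / bandwidth


# A rough step estimate smoothed at an interior point (illustrative values only).
print(smooth_step_estimate([1.0, 3.0, 2.0, 4.0], t=0.5, bandwidth=0.1))
```

At interior points a constant input is returned unchanged up to quadrature error; near the endpoints of [0,1] part of the kernel mass falls outside the interval, so the smoothed value is biased downward there.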


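Theorem 5.1. Suppose that X is a counting process, the histogram sieve is used,
conditions (C1) - (C4) hold, $\alpha_j$ and K are Lipschitz, $b_n = n^{-\beta}$ where $1/4 \le \beta < 1/2$
and $d_n = [n^{\delta}]$ where $\frac{1}{2}(1-\beta) \le \delta < \frac{1}{2}$. Then

$$\sup_{t \in [0,1]} |\tilde\alpha_j^{(n)}(t) - \alpha_j(t)| = O_p\big( n^{-\frac{1}{2}(1-2\beta)} \big).$$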

In  future  work  we  shall  study  the  performance  of  the  above  estimators  and 
test  statistics  using  simulated  data  and  discuss  their  application  to  real  data. 


6.  Proofs. 


The  following  lemma  is  needed  for  the  proof  of  Theorem  2.2. 


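Lemma 6.1. Suppose that M is a left-quasi-continuous square integrable martingale
satisfying condition (D3). Let $2 \le q \le 4$. Then there exist constants $C_1$, $C_2$
such that for each predictable process $(H_t, t \in [0,1])$ satisfying

$$E \int_0^1 H_t^2\, d\langle M \rangle_t < \infty, \qquad (6.1)$$

$$E\Big( \sum_{0 < t \le 1} H_t^2 (\Delta M_t)^2 \Big) < \infty, \qquad (6.2)$$

the following inequality holds:

$$E\Big| \int_0^1 H_t\, dM_t \Big|^q \le C_1 E\Big( \int_0^1 H_t^2\, d\langle M \rangle_t \Big)^{q/2} + C_2 \Big( E \int_0^1 H_t^4\, d\pi_t \Big)^{q/4},$$

where $\pi = (\pi_t)$ is the compensator of $\sum_{0 < s \le t} (\Delta M_s)^4$.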


Proof. Decompose M into the sum of a continuous martingale $M^c$ and a purely
discontinuous martingale $M^d$ such that $\langle M^c, M^d \rangle = 0$, see Dellacherie and Meyer (1982,
VIII 43). $\langle M^d \rangle$ is the compensator of the square bracket process $[M^d]_t = \sum_{0 < s \le t} (\Delta M_s)^2$,
which is left-quasi-continuous. Thus by Dellacherie (1972, p. 111), $\langle M^d \rangle$ is continuous.
The process $Q = [M] - \langle M \rangle = [M^d] - \langle M^d \rangle$ is a square integrable martingale
(by (D3)),

$$[Q]_t = \big[ [M^d] - \langle M^d \rangle \big]_t = \big[ [M^d] \big]_t = \sum_{0 < s \le t} (\Delta M_s)^4,$$

and $\langle Q \rangle_t = \pi_t$ where $\pi_t$ is defined in the statement of the Lemma. If
$E \int_0^1 H_t^4\, d\langle Q \rangle_t = \infty$ nothing remains to be proved, so assume that $E \int_0^1 H_t^4\, d\langle Q \rangle_t < \infty$. Then
the stochastic integral $\int_0^1 H_t^2\, dQ_t$ is defined. Conditions (6.1) and (6.2) ensure
that $E \int_0^1 H_t^2\, |dQ_t| < \infty$, so by Kopp (1984, Theorem 4.3.18) the stochastic integral
and Stieltjes integral interpretations of $\int_0^1 H_t^2\, dQ_t$ coincide. By the
Burkholder-Davis-Gundy inequality (see Dellacherie and Meyer, 1982, p. 287) there exists a
constant C such that

$$E\Big| \int_0^1 H_t\, dM_t \Big|^q \le C\, E\Big( \int_0^1 H_t^2\, d[M]_t \Big)^{q/2}$$

$$\le C_1 E\Big( \int_0^1 H_t^2\, d\langle M \rangle_t \Big)^{q/2} + C_2 E\Big| \int_0^1 H_t^2\, dQ_t \Big|^{q/2},$$

where we have used $|a+b|^{\gamma} \le 2^{\gamma-1}(|a|^{\gamma} + |b|^{\gamma})$ for $\gamma \ge 1$, and $C_1 = C_2 = 2^{q/2-1} C$. Finally,
by Lyapounov's inequality,

$$E\Big| \int_0^1 H_t^2\, dQ_t \Big|^{q/2} \le \Big[ E\Big( \int_0^1 H_t^2\, dQ_t \Big)^2 \Big]^{q/4} = \Big[ E \int_0^1 H_t^4\, d\pi_t \Big]^{q/4},$$

which completes the proof. □


Proof of Theorem 2.2. It is clear that for each $n \ge 1$ the histogram sieve
estimator $\hat\alpha_j^{(n)}$ can be written in the form of the orthogonal series sieve estimator
considered in McKeague (1986a, 1986b), where the orthonormal vectors $\phi_r$, r = 1,
..., $d_n$ used to define [...] are replaced by

$$\phi_r^{(n)}(t) = \begin{cases} [\ldots] & \\ 0 & \text{otherwise,} \end{cases}$$

r = 1, ..., $d_n$. Now a formal rewriting of the proof of Corollary 2.3 and Theorem
2.1 of McKeague (1986b), expanding in terms of $\phi_r^{(n)}$, r = 1, ..., $d_n$, shows
that the finite dimensional distributions of $\sqrt{n}\,(\hat A_j^{(n)} - A_j^{\{n\}})$ converge weakly to
those of $m_j$, the required conditions holding when (C1) - (C3), (D1) and (D2) are
satisfied. Moreover, we have the representation



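$$\sqrt{n}\,(\hat A_j^{(n)}(t) - A_j^{\{n\}}(t)) = U_n(t) + V_n(t), \qquad (6.3)$$

where $\sup_{t \in [0,1]} |U_n(t)| \to 0$,

$$V_n(t) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} Z_{ni}(t),$$

$$Z_{ni}(t) = \sum_{k=1}^{p} \int_0^1 u_k^{(n)}(s,t)\, Y_{ik}(s)\, dM_i(s),$$

$$u_k^{(n)}(s,t) = [\ldots],$$

$$\mathrm{vec}[\lambda^{(n)}(t)] = R^{(n)-1}\, \mathrm{vec}[h^{(n)}(t)],$$

$$h_{kr}^{(n)}(t) = \begin{cases} \int_0^1 1_{[0,t]}(s)\, \phi_r^{(n)}(s)\,ds & \text{for } k = j,\ r = 1, \ldots, d_n, \\ 0 & \text{for } k \ne j,\ r = 1, \ldots, d_n, \end{cases}$$

where $R^{(n)}$ is the $p d_n \times p d_n$ matrix partitioned into the $p^2$ submatrices $R_{km}^{(n)}$, k, m = 1,
..., p with entries

[...]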


The proof of Lemma 4.3 of McKeague (1986b) shows that $R^{(n)}$ is invertible for all
$n \ge 1$ and

$$\sup_{n \ge 1} \| R^{(n)-1} \| < \infty, \qquad (6.4)$$

where $\| \cdot \|$ denotes operator norm. From (6.3) it remains to show that the sequence
of processes $(V_n, n \ge 1)$ is tight in C[0,1]. If we can find constants $q \ge 0$, $\gamma > 1$
and C such that for all $n \ge 1$, $t_1, t_2 \in [0,1]$

$$E|V_n(t_2) - V_n(t_1)|^q \le C|t_2 - t_1|^{\gamma} \qquad (6.5)$$

then, by Billingsley (1968, Theorem 12.3), $\{V_n, n \ge 1\}$ will be tight. In what follows
C denotes a generic positive constant which is independent of n. By the
Marcinkiewicz-Zygmund inequality (see Chow and Teicher, 1978, p. 356) for $q \ge 1$

$$E|V_n(t_2) - V_n(t_1)|^q \le C\, E\Big\{ \frac{1}{n} \sum_{i=1}^{n} [Z_{ni}(t_2) - Z_{ni}(t_1)]^2 \Big\}^{q/2}$$

$$\le C\, E|Z_{n1}(t_2) - Z_{n1}(t_1)|^q$$

$$\le C \sum_{k=1}^{p} E\Big| \int_0^1 \{ u_k^{(n)}(s,t_2) - u_k^{(n)}(s,t_1) \}\, Y_k(s)\, dM_s \Big|^q. \qquad (6.6)$$

Since $\phi_r^{(n)}$ is a bounded function so is $u_k^{(n)}(\cdot, t)$ (for each fixed t) and it follows,
using conditions (D2) and (D4), that for fixed k, n, $t_1$, $t_2$ the predictable
process $H_s = v(s) Y_k(s)$, where $v(s) = u_k^{(n)}(s,t_2) - u_k^{(n)}(s,t_1)$, satisfies conditions
(6.1), (6.2) of Lemma 6.1. Thus, for $2 \le q \le 4$

$$E\Big| \int_0^1 \{ u_k^{(n)}(s,t_2) - u_k^{(n)}(s,t_1) \}\, Y_k(s)\, dM_s \Big|^q \le C_1 E\Big( \int_0^1 v^2(s)\, Y_k^2(s)\, d\langle M \rangle_s \Big)^{q/2}$$

$$+\; C_2 \Big( E \int_0^1 v^4(s)\, Y_k^4(s)\, d\pi_s \Big)^{q/4}. \qquad (6.7)$$

Now choose $\gamma$ satisfying (D2) and $1 < \gamma \le 2$. Then setting $q = 2\gamma$ we have




Proof of Theorem 2.1. It suffices to check that the conditions of Theorem 2.2
are satisfied. By Liptser and Shiryayev (1978, p. 243) the counting process X
is left-quasi-continuous. (D1) is a consequence of (C4). Since
$\langle M \rangle_t = \int_0^t \lambda(s)\,ds = \int_0^t \big( \sum_{j=1}^{p} \alpha_j(s) Y_j(s) \big)\,ds$,
(D2) holds with $\gamma = 2$ as a consequence of (C1) and (C4). In
the counting process case $\sum_{0 < s \le t} (\Delta M_s)^2$ coincides with X(t) which has compensator
$\int_0^t \lambda(s)\,ds$ so that (D3) is implied by (C1) and (C4). For condition (D4),

$$E\Big( \sum_{0 < s \le 1} Y_j^2(s) (\Delta M_s)^2 \Big) = E\Big( \int_0^1 Y_j^2(s)\, dX(s) \Big)$$

$$= E\Big( \int_0^1 Y_j^2(s)\, \lambda(s)\,ds \Big) = \sum_{k=1}^{p} \int_0^1 \alpha_k(s)\, E[Y_j^2(s) Y_k(s)]\,ds < \infty$$

by (C1) and (C4). □


Proof of Theorem 3.1. Condition (3.1) implies that

$$E\Big( \sup_{t \in [0,1]} |Y_j(t) Y_k(t)| \Big) < \infty \qquad (6.9)$$

and

$$E\Big( \sup_{t \in [0,1]} |Y_j^2(t) Y_k(t)| \Big) < \infty, \qquad (6.10)$$

for j, k = 1, ..., p using Hölder's inequality. Applying the strong law of large
numbers in D[0,1] (Ranga Rao, 1963, Theorem 1) in the reversed time direction,
given (6.9) and (6.10) it follows that for all j, k = 1, ..., p

$$\sup_{t \in [0,1]} |\hat K_{jk}^{(n)}(t) - K_{jk}(t)| \xrightarrow{a.s.} 0, \qquad (6.11)$$

$$\sup_{t \in [0,1]} |\hat M_{jk}^{(n)}(t) - M_{jk}(t)| \xrightarrow{a.s.} 0, \qquad (6.12)$$

where $M_{jk}(t) = E[Y_j^2(t) Y_k(t)]$. By (6.11) and the condition on det K(t), $\hat K^{(n)}(t)$
is nonsingular for all $t \in [0,1]$ for n sufficiently large a.s. so by the
componentwise continuity of the matrix inverse operation we have

$$\sup_{t \in [0,1]} |\hat L_j^{(n)}(t) - L_j(t)| \xrightarrow{a.s.} 0. \qquad (6.13)$$


Let $G_j(t) = E\,m_j^2(t)$, given in the statement of Theorem 2.1. Under the conditions
of the Theorem,

$$\int_0^1 [\hat\alpha_j^{(n)}(t) - \alpha_j(t)]^2\,dt \xrightarrow{P} 0, \qquad (6.14)$$

from a histogram sieve version of Theorem 2.1 of McKeague (1986a). Using (6.12) -
(6.14) and the Cauchy-Schwarz inequality it is then easy to show that

$$\sup_{t \in [0,1]} |\hat G_j^{(n)}(t) - G_j(t)| \xrightarrow{P} 0. \qquad (6.15)$$

Define

$$Z_n(t) = \sqrt{n}\, G_j(1)^{1/2}\, \frac{\hat A_j^{(n)}(t) - A_j^{\{n\}}(t)}{G_j(1) + G_j(t)}$$

and note that by Theorem 2.1 $Z_n$ converges weakly in C[0,1] to the process
$B^0\big( G_j(t) / (G_j(1) + G_j(t)) \big)$, $t \in [0,1]$, where $B^0$ is the Brownian bridge process. By the
continuous mapping theorem (Billingsley, 1968, Theorem 5.1) $\sup_{t \in [0,1]} |Z_n(t)|$ converges
in distribution to $\sup_{t \in [0,1/2]} |B^0(t)|$. But

$$\sup_{t \in [0,1]} \left| \sqrt{n}\, \hat G_j^{(n)}(1)^{1/2}\, \frac{\hat A_j^{(n)}(t) - A_j^{\{n\}}(t)}{\hat G_j^{(n)}(1) + \hat G_j^{(n)}(t)} - Z_n(t) \right|$$

$$\le \sup_{t \in [0,1]} |Z_n(t)| \; \sup_{t \in [0,1]} \left| \frac{\hat G_j^{(n)}(1)^{1/2}\, (G_j(1) + G_j(t))}{G_j(1)^{1/2}\, (\hat G_j^{(n)}(1) + \hat G_j^{(n)}(t))} - 1 \right|$$

which tends to 0 in probability since the first term is tight and (6.15) shows
that the second term tends to zero in probability. The result now follows from
Theorem 4.1 of Billingsley (1968). □


Proof  of  Theorem  5.1. 


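Define the following smoothed versions of $\alpha_j$ and $\alpha_j^{\{n\}}$:

$$\bar\alpha_j^{(n)}(t) = \frac{1}{b_n} \int_0^1 K\Big( \frac{t-s}{b_n} \Big)\, \alpha_j(s)\,ds,$$

$$\alpha_j^{*(n)}(t) = \frac{1}{b_n} \int_0^1 K\Big( \frac{t-s}{b_n} \Big)\, \alpha_j^{\{n\}}(s)\,ds.$$

Since K is Lipschitz it is of bounded variation. Denote its total variation by
V(K). Then (cf. the proof of Theorem 4.1.2 of Ramlau-Hansen (1983)),

$$\sup_{t \in [0,1]} |\tilde\alpha_j^{(n)}(t) - \alpha_j^{*(n)}(t)| = \sup_{t \in [0,1]} \Big| \frac{1}{b_n} \int_0^1 K\Big( \frac{t-s}{b_n} \Big)\, d(\hat A_j^{(n)} - A_j^{\{n\}})(s) \Big|$$

$$\le \frac{1}{b_n}\, V(K) \sup_{s \in [0,1]} |\hat A_j^{(n)}(s) - A_j^{\{n\}}(s)| = O_p\Big( \frac{1}{b_n \sqrt{n}} \Big), \qquad (6.16)$$

since $\{ \sqrt{n}\,(\hat A_j^{(n)} - A_j^{\{n\}}),\ n \ge 1 \}$ is tight in C[0,1] by Theorem 2.1. Next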


$$|\alpha_j^{*(n)}(t) - \bar\alpha_j^{(n)}(t)| = \Big| \frac{1}{b_n} \int_0^1 K\Big( \frac{t-s}{b_n} \Big) [\alpha_j^{\{n\}}(s) - \alpha_j(s)]\,ds \Big|$$

$$\le \frac{1}{b_n} \Big[ \int_0^1 K^2\Big( \frac{t-s}{b_n} \Big)\,ds \Big]^{1/2} \Big[ \int_0^1 [\alpha_j^{\{n\}}(s) - \alpha_j(s)]^2\,ds \Big]^{1/2} = O(b_n^{-1/2})\, O(d_n^{-1}), \qquad (6.17)$$

since $\alpha_j$ is Lipschitz. The Lipschitz assumption on $\alpha_j$ can also be used to show

$$\sup_{t \in [0,1]} |\bar\alpha_j^{(n)}(t) - \alpha_j(t)| = O(b_n). \qquad (6.18)$$

Combining (6.16) - (6.18) we obtain

$$\sup_{t \in [0,1]} |\tilde\alpha_j^{(n)}(t) - \alpha_j(t)| = O_p(n^{\beta - 1/2}) + O(n^{\beta/2 - \delta}) + O(n^{-\beta}) = O_p\big( n^{-\frac{1}{2}(1-2\beta)} \big),$$

since $\frac{1}{2}\beta - \delta \le \frac{1}{2}\beta - \frac{1}{2}(1-\beta) = -\frac{1}{2}(1-2\beta)$ and $-\beta \le \beta - \frac{1}{2} = -\frac{1}{2}(1-2\beta)$ when $\beta \ge \frac{1}{4}$.
This completes the proof of the theorem. □
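
The exponent bookkeeping at the end of the proof can be checked with exact rational arithmetic. The sketch below takes the three orders $n^{\beta - 1/2}$, $n^{\beta/2 - \delta}$ and $n^{-\beta}$ from (6.16) - (6.18) with $b_n = n^{-\beta}$ and $d_n = [n^{\delta}]$, and returns their exponents; the function name and the sample values of $\beta$ and $\delta$ are illustrative assumptions.

```python
from fractions import Fraction


def rate_exponents(beta, delta):
    """Exponents of n in the three error terms, for admissible beta and delta."""
    beta, delta = Fraction(beta), Fraction(delta)
    half = Fraction(1, 2)
    if not Fraction(1, 4) <= beta < half:
        raise ValueError("need 1/4 <= beta < 1/2")
    if not (1 - beta) / 2 <= delta < half:
        raise ValueError("need (1 - beta)/2 <= delta < 1/2")
    stochastic = beta - half        # from (6.16): 1 / (b_n * sqrt(n))
    sieve_bias = beta / 2 - delta   # from (6.17): b_n^(-1/2) / d_n
    kernel_bias = -beta             # from (6.18): b_n
    return stochastic, sieve_bias, kernel_bias


exps = rate_exponents(Fraction(1, 3), Fraction(2, 5))
print(exps, max(exps))  # the largest exponent determines the overall rate
```

For any admissible pair the largest exponent is $\beta - 1/2 = -\frac{1}{2}(1 - 2\beta)$, so the stochastic term sets the rate; taking $\beta = 1/4$ gives the fastest rate covered by the theorem, $n^{-1/4}$.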